On the Inefficient Use of Entropy for Anomaly Detection
Abstract
Entropy-based measures have been widely deployed in anomaly detection systems (ADSes) to quantify behavioral patterns [1]. The entropy measure has shown significant promise in detecting a diverse set of anomalies present in networks and end-hosts. We argue that the full potential of entropy-based anomaly detection is currently not being exploited because of its inefficient use. In support of this argument, we highlight three important shortcomings of existing entropy-based ADSes. We then propose efficient entropy usage, supported by preliminary evaluations, to mitigate these shortcomings.

1 Entropy Limitations and Countermeasures

1.1 Feature correlation should be retained

Current ADSes perform entropy analysis on marginal distributions of features. In general, significant correlation exists across traffic and/or host features, which these ADSes do not leverage. As a proof-of-concept example, we propose to detect malicious network sessions by noting that the histogram of keystrokes used to initiate network sessions is skewed [see Fig. 1(a)], and a perturbation in this metric can easily reveal the presence of an anomaly. Network traffic and keystroke data were collected before and after infecting a human-operated computer with the low-rate Rbot-AQJ worm. While analyzing the entropies of the marginal keystroke distribution and/or the marginal session distribution is clearly not useful, Fig. 1(b) shows that quantifying these features using joint (session-keystroke) entropy can easily detect anomalous activity.

1.2 Spatial/temporal correlation should be retained

Another limitation of the entropy measure is its inability to take spatial/temporal correlation of benign patterns into account. Such correlations can prove useful in the detection of subtle anomalies. For instance, Fig. 1(c) shows the blockwise (block size = 1KB) entropy of a PDF file which is infected by an embedded executable malware.
It is evident that entropy is unable to provide the clear perturbations required for detection. On the other hand, entropy rate [Fig. 1(d)], which models and accounts for the spatial/temporal correlation, provides very clear perturbations at the infected file blocks; entropy rate quantifies the average entropy of conditional distributions [2].

[Figure 1(a): Histogram of session-keystrokes; x-axis: Virtual Key Code, y-axis: Normalized Freq.]
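The two countermeasures described in Sections 1.1 and 1.2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy session/keystroke labels, and the toy byte strings are all hypothetical, and the entropy rate is approximated here by a first-order conditional entropy H(X_t | X_{t-1}) = H(X_{t-1}, X_t) - H(X_{t-1}), which is only one possible reading of "the average entropy of conditional distributions".

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def joint_entropy(xs, ys):
    """Joint entropy H(X, Y) estimated from paired observations."""
    return entropy(list(zip(xs, ys)))


def entropy_rate(seq):
    """First-order estimate of the entropy rate: H(X_t | X_{t-1}).

    Assumption: a first-order Markov approximation, computed as
    H(X_{t-1}, X_t) - H(X_{t-1}) over consecutive symbol pairs.
    """
    if len(seq) < 2:
        return 0.0
    prev, curr = seq[:-1], seq[1:]
    return entropy(list(zip(prev, curr))) - entropy(prev)


# Section 1.1 (toy data): the marginals are identical in both traces,
# but the session/keystroke pairing differs.
benign_sessions = ["web", "web", "mail", "mail"]
benign_keys = ["ENTER", "ENTER", "CLICK", "CLICK"]
odd_sessions = ["web", "web", "mail", "mail"]
odd_keys = ["ENTER", "CLICK", "ENTER", "CLICK"]

print(entropy(benign_keys), entropy(odd_keys))       # marginal entropies match
print(joint_entropy(benign_sessions, benign_keys))   # low: pairing is regular
print(joint_entropy(odd_sessions, odd_keys))         # higher: pairing is broken

# Section 1.2 (toy data): two sequences with the same symbol histogram
# but different ordering.
regular = "abababab"
disordered = "aabbabba"
print(entropy(regular), entropy(disordered))             # identical
print(entropy_rate(regular), entropy_rate(disordered))   # differ
```

In both toy cases the plain (marginal or blockwise) entropy cannot separate the two inputs, while the joint entropy and the conditional estimate can, which mirrors the argument made in the text. For real traffic or file blocks, the same functions would be applied per time window or per 1KB block.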